Self-organization in mixture densities of HMM based speech recognition

Author

Mikko Kurimo

Abstract

This paper presents experiments that apply the Self-Organizing Map (SOM) and Learning Vector Quantization (LVQ) to the training of mixture density hidden Markov models (HMMs) for automatic speech recognition. Spoken words are decoded into text using phoneme HMMs that are speaker dependent but independent of vocabulary and context. Each HMM has a set of states, and the output density of each state is a unique mixture of Gaussian densities. The mixture densities are trained by segmental versions of SOM and LVQ3: SOM is used to initialize and smooth the mixture densities, and LVQ3 to decrease recognition errors simply and robustly.
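As a rough illustration of the two ingredients named in the abstract, the sketch below evaluates a state's output density as a mixture of Gaussians and applies one SOM update to the Gaussian mean vectors. It is not the paper's method: the abstract describes segmental versions of SOM and LVQ3, whereas this is a plain, single-vector SOM step. The diagonal covariances, the 1-D chain topology, and all names (`log_gaussian_diag`, `log_mixture_density`, `som_step`, `alpha`, `radius`) are assumptions made for the example.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at x (assumed form)."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def log_mixture_density(x, weights, means, variances):
    """Log of sum_k w_k * N(x; mean_k, var_k), using log-sum-exp."""
    terms = [
        math.log(w) + log_gaussian_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
        if w > 0.0  # skip components with zero weight
    ]
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def som_step(x, means, alpha=0.1, radius=1):
    """One plain SOM update on a 1-D chain of mean vectors.

    Finds the mean closest to x (the winner) and moves the winner and
    its neighbours within `radius` on the chain a fraction `alpha`
    of the way toward x. Returns the winner index and the new means.
    """
    dists = [sum((xi - mi) ** 2 for xi, mi in zip(x, m)) for m in means]
    winner = dists.index(min(dists))
    updated = []
    for k, m in enumerate(means):
        if abs(k - winner) <= radius:
            updated.append([mi + alpha * (xi - mi) for xi, mi in zip(x, m)])
        else:
            updated.append(list(m))
    return winner, updated


if __name__ == "__main__":
    means = [[0.0, 0.0], [4.0, 0.0], [10.0, 0.0]]
    variances = [[1.0, 1.0]] * 3
    weights = [0.5, 0.3, 0.2]
    frame = [1.0, 0.0]  # a hypothetical feature vector
    print(log_mixture_density(frame, weights, means, variances))
    print(som_step(frame, means, alpha=0.5, radius=1))
```

Because neighbouring means are pulled toward the same input, repeated updates of this kind tend to leave nearby units on the chain with similar mean vectors, which is one way to read the "smoothing" role the abstract assigns to SOM.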

Similar resources

Mixture trees - hierarchically tied mixture densities for modeling HMM emission probabilities

We propose a novel hierarchical mixture model and present its application to acoustic modeling for HMM based large vocabulary conversational speech recognition. We detail an EM algorithm for estimating the parameters of such a mixture tree for the case of Gaussian component densities. We sketch how clustering algorithms can be applied to automatically construct suitable mixture trees for a larg...

Improving Phoneme Sequence Recognition using Phoneme Duration Information in DNN-HSMM

Improving phoneme recognition has attracted the attention of many researchers due to its applications in various fields of speech processing. Recent research shows that using deep neural networks (DNNs) in speech recognition systems significantly improves their performance. DNN-based phoneme recognition systems have two phases: training and testing. Mos...

Using the self-organizing map to speed up the probability density estimation for speech recognition with mixture density HMMs

This paper presents methods to improve the probability density estimation in hidden Markov models for phoneme recognition by exploiting the Self-Organizing Map (SOM) algorithm. The advantage of using the SOM is based on the created approximative topology between the mixture densities by training the Gaussian mean vectors used as the kernel centers by the SOM algorithm. The topology makes the ne...

Principal mixture speaker adaptation for improved continuous speech recognition

Nowadays, almost all speaker-independent (SI) speech recognition systems use CDHMMs with multivariate Gaussian mixtures as the observation density to cover speaker variability. It has been shown that, given sufficient training data, the more mixtures are used in the HMM observation density, the better the system performs. However, an acoustic HMM with more Gaussian densities is more complex and slows ...

Options for Modelling Temporal Statistical Dependencies in an Acoustic Model for ASR

In this paper we consider the combination of hidden Markov models based on Gaussian mixture densities (GMM-HMM) and linear dynamic models (LDM) as the acoustic model for automatic speech recognition systems. In doing so, the individual strengths of both models, i.e. the modelling of long-term temporal dependencies by the GMM-HMM and the direct modelling of statistical dependencies between conse...


Publication date: 1998
